Fingerprinting Biomedical Terminologies – Automatic Classification and Visualization of Biomedical Vocabularies through UMLS Semantic Group Profiles

Authors

  • Bastien Rance
  • Thai Le
  • Olivier Bodenreider
Abstract

OBJECTIVES To explore automatic methods for the classification of biomedical vocabularies based on their content. METHODS We create semantic group profiles for each source vocabulary in the UMLS and compare the resulting vectors using the Euclidean distance. We explore several techniques for visualizing individual semantic group profiles and the entire distance matrix, including donut pie charts, heatmaps, dendrograms and networks. RESULTS We provide donut pie charts for individual source vocabularies, as well as a heatmap, dendrogram and network for a subset of 78 vocabularies from the UMLS. CONCLUSIONS Our approach to fingerprinting biomedical terminologies is completely automated and can easily be applied to all source vocabularies in the UMLS, including upcoming versions of the UMLS. It supports the exploration, selection and comparison of the biomedical terminologies integrated into the UMLS. The visualizations are available at http://mor.nlm.nih.gov/pubs/supp/2015-medinfo-br/index.html.
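
The profile-and-distance step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the vocabulary names and concept counts are made up, only four semantic groups are used, and normalizing counts to proportions before comparison is an assumption that the abstract does not confirm.

```python
import math

# Abbreviations of a few UMLS semantic groups (a subset, for illustration).
GROUPS = ["ANAT", "CHEM", "DISO", "PROC"]

# Hypothetical concept counts per semantic group; NOT real UMLS figures.
counts = {
    "VOCAB_A": {"DISO": 80, "PROC": 20},
    "VOCAB_B": {"DISO": 60, "PROC": 40},
    "VOCAB_C": {"CHEM": 90, "ANAT": 10},
}


def profile(group_counts):
    """Turn raw counts into a vector of proportions, one entry per group."""
    total = sum(group_counts.values())
    if total == 0:
        raise ValueError("vocabulary has no concepts")
    return [group_counts.get(g, 0) / total for g in GROUPS]


def euclidean(p, q):
    """Euclidean distance between two equally long profile vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(all_counts):
    """Pairwise distances between all vocabularies, keyed by name pair."""
    names = sorted(all_counts)
    profiles = {n: profile(all_counts[n]) for n in names}
    return {
        (a, b): euclidean(profiles[a], profiles[b])
        for a in names
        for b in names
    }


matrix = distance_matrix(counts)
```

In this toy data the two disorder-heavy vocabularies end up closer to each other than either is to the chemistry-heavy one. A matrix of this kind is what a heatmap or a dendrogram would then be drawn from.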

Similar resources

Network Visualization of UMLS Source Vocabularies using Semantic Groups

Exploring UMLS source vocabularies can be challenging for non-specialists. In this poster, we generated network graphs to visualize the content of medical terminologies. We built a graph containing UMLS source vocabularies and semantic groups, where edges are created between a source and its significant semantic group(s). Significance was calculated using the frequency of the semantic groups in...

Automatic Classification and Visualization of UMLS Source Vocabularies through Semantic Group Profiles

The Unified Medical Language System® (UMLS) is a comprehensive terminology integration system designed to support the development of electronic information systems. The UMLS integrates 161 source vocabularies, though for a given purpose, a developer may not need every vocabulary. With the breadth of vocabularies available, there is a need for classifying the UMLS source vocabularies with respec...

Combining terminologies and ontologies to integrate biomedical information

The post genomics era is characterized by huge amounts of biomedical information, distributed in multiple databanks (e.g. SWISS-PROT, OMIM, LocusLink, GenBank, as well as many others). Despite recent efforts to provide standard ontologies such as Gene Ontology, semantic heterogeneity is a major obstacle to information integration. Each databank has its own identifiers for genes and gene product...

Exploitation of semantic similarity for adaptation of existing terminologies within biomedical area

We present a novel method for the adaptation of existing terminologies. Within the biomedical domain, and when no textual corpora for building terminologies are available, we exploit the UMLS Metathesaurus, which merges over a hundred existing biomedical terminologies and ontologies. We also exploit algorithms for measuring semantic similarity in order to limit, within UMLS, a semantically homogeneous sp...

Toward automating an inference model on unstructured terminologies: OXMIS case study.

Most modern biomedical vocabularies employ some hierarchical representation that provides a "broader/narrower" meaning relationship among the "codes" or "concepts" found within them. Often, however, we may find within the clinical setting the creation and curation of unstructured custom vocabularies used in the everyday practice of classifying and categorizing clinical data and findings. A signi...

Journal title:

Volume 216, Issue

Pages -

Publication date: 2015